Plagiarism Detector

Project Overview

In this project, you will build a plagiarism detector that examines a text file and performs binary classification, labeling the file as either plagiarized or not depending on how similar it is to a provided source text.
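To make "how similar" concrete, here is a minimal sketch of one possible similarity measure: the fraction of an answer's word n-grams that also appear in the source text. This measure, the function names (`ngrams`, `containment`, `classify`) and the fixed `threshold` are all assumptions for illustration; the project notebooks define the actual features, and a trained classifier would normally replace the hand-picked threshold.

```python
def ngrams(text, n):
    """Return the set of word n-grams in `text` (lowercased, whitespace-split)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def containment(answer, source, n=1):
    """Fraction of the answer's distinct n-grams that also occur in the source."""
    answer_grams = ngrams(answer, n)
    if not answer_grams:
        return 0.0  # answer shorter than n words: nothing to compare
    return len(answer_grams & ngrams(source, n)) / len(answer_grams)


def classify(answer, source, n=1, threshold=0.5):
    """Return 1 (plagiarized) or 0 (not), using a hypothetical fixed threshold."""
    return int(containment(answer, source, n) >= threshold)
```

For example, `classify(answer_text, source_text, n=2)` returns 1 when at least half of the answer's distinct word pairs also appear in the source.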

Getting the Project Materials

You have been given starter notebooks in a GitHub repository, linked below.

Since this project uses SageMaker, it is suggested that you create a new SageMaker notebook instance from your AWS console and link it to the GitHub repository https://github.com/udacity/ML_SageMaker_Studies.

The project files are in the Project_Plagiarism_Detection directory.

You should complete each exercise and question; your project will be evaluated against the project rubric.

Submission and Project Evaluation

You will be graded on your implementation of a plagiarism detector as well as on complete answers to any questions in the project notebooks. You'll submit a zip file or GitHub repo that includes the complete notebooks, with all cells executed, and you'll be graded according to the project rubric.

  1. Ensure you've passed all the unit tests in the notebook.
  2. Ensure you pass all points on the rubric.
  3. When you're done with the project, please save the last two notebooks, "2_Plagiarism_Feature_Engineering.ipynb" and "3_Training_a_Model.ipynb", making sure you've run all cells. You can do this by going to the File menu in the notebook and choosing "Download".
  4. Package "2_Plagiarism_Feature_Engineering.ipynb", "3_Training_a_Model.ipynb", "problem_unittests.py", "helpers.py", and at least one complete training directory (source_sklearn or source_pytorch) into a zip archive, or push these files to your GitHub repo.

Note that the zip file should not include the large data files or any generated data directories; these would make your submission too large to download and manage.
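If you prefer to build the archive with a script rather than by hand, the packaging step can be sketched as follows. This is only one possible approach: the function name `build_submission` and the output file name are assumptions, while the file and directory names come from the steps above. Because the script adds only the listed files and training directories, data directories are left out automatically.

```python
import zipfile
from pathlib import Path

# File and directory names taken from the submission steps above.
REQUIRED_FILES = [
    "2_Plagiarism_Feature_Engineering.ipynb",
    "3_Training_a_Model.ipynb",
    "problem_unittests.py",
    "helpers.py",
]
TRAINING_DIRS = ["source_sklearn", "source_pytorch"]


def build_submission(project_dir, zip_path):
    """Zip the required files plus every training directory that exists.

    Returns the archive member names. Raises FileNotFoundError if a required
    file is missing or if neither training directory is present.
    """
    project_dir = Path(project_dir)
    members = []
    for name in REQUIRED_FILES:
        path = project_dir / name
        if not path.is_file():
            raise FileNotFoundError(f"missing required file: {name}")
        members.append(path)

    found_dirs = [project_dir / d for d in TRAINING_DIRS if (project_dir / d).is_dir()]
    if not found_dirs:
        raise FileNotFoundError("need source_sklearn or source_pytorch")
    for directory in found_dirs:
        members.extend(sorted(p for p in directory.rglob("*") if p.is_file()))

    names = [p.relative_to(project_dir).as_posix() for p in members]
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, name in zip(members, names):
            archive.write(path, arcname=name)
    return names
```

Calling `build_submission(".", "submission.zip")` from the project directory writes the archive and returns the list of files it contains, which you can check against the list in step 4.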

Once you have all of your project files in one zip file, click Submit and follow the instructions for uploading your complete project.

Project Submission Checklist

Before submitting your project, please review and confirm the following items.
I am confident all rubric items have been met and my project will pass as submitted.
My project builds and runs without errors.
All required functionality exists and my project behaves as expected per the project's specifications.

Once you have checked all these items, you are ready to submit!